By Alex Hughes

Published: Friday, 01 July 2022 at 12:00 am


AI image generators are having their moment right now. Thanks to OpenAI and their creation known as Dall-E 2, people across the internet have been able to make their own detailed images just from worded prompts.

Soon after OpenAI’s creation, Google released a direct competitor, using OpenAI’s open-source code to help create Imagen – an equally impressive AI image generator, once again capable of making images from simple phrases.

However, while both of these inventions were revolutionary in the AI world, they were only available to a select few, offering waitlists as they slowly gave access to new users.

Shortly after, the internet exploded with people making their own Dall-E images, albeit at a much lower level of quality. It wasn’t because OpenAI suddenly opened up access but instead because someone had made their own version of the software based heavily on the original, known as Dall-E mini.

We spoke to the creator of Dall-E mini about how it came to be, its viral potential and the future of the project.

What is Dall-E mini and how did it come to be?

© Dall-E mini

Dall-E mini is yet another AI image generator taking the internet by storm. However, where it differs is that it is completely free for everyone to use. Despite the near-identical name, it has nothing to do with OpenAI, other than making use of the large amount of publicly-available information OpenAI has provided on their model.

Instead, this project was created by a software engineer known as Boris Dayma. “When I heard about it [Dall-E], I thought that was so cool and that I want to build something like that. So I read their paper on the model, but I would never understand it, it was so complicated,” says Dayma.

It wasn’t until July 2021 that Boris had the chance to try building something similar, when he signed up for a competition run by Google and Hugging Face, an AI community. He was paired up with a team and given support on his project, and together they decided to try to create an AI image generator like Dall-E.

“By the end of the month, we had something kind of cool. It was not at the level it is now, but it could produce simple prompts like beach at night or day. We won the competition and I continued to work on the product, making improvements since then.”

The model didn’t take off at first, attracting just a small audience, but around two months ago the internet picked it up, embracing its knack for producing viral images.

One key difference with Dall-E mini is that it is not filtered at all, owing to the smaller team and free-to-use nature. Unlike Google’s Imagen and OpenAI’s Dall-E 2, which have safety protocols, it will accept any prompt. People are able to use Dall-E mini for everything from cartoons performing a Ted Talk and celebrities playing Quidditch to racist imagery, extreme violence or depictions of real-world traumatic situations.

A Ted Talk with an expert host © Dall-E mini

Going viral

With this free service going viral online, there were suddenly a lot more people than just Boris using the platform. His main takeaway was the creativity of its newfound users.

“I would write something like a view of a lake under the moonlight, or Eiffel Tower on the moon and these were my most complex prompts. But when I see what people use it for, I’m amazed. I don’t have that level of creativity and they learn how to tweak the model to create really specific prompts that I could never come up with,” says Boris.

He has even taken to scrolling through Twitter when he needs to relax, checking out what people can create. He has a particular fondness for prompts that use the term ‘trail cam’, which produce grainy images that look like they have come from a low-res camera at night.

Tinky Winky, Laa-Laa and Po have a nightmare © Dall-E mini

Blurred faces and creative inputs

Despite the model’s popularity, it isn’t without its limits. Compared to OpenAI’s original model, or Google’s more recent Imagen, Dall-E mini clearly struggles to match them on image quality.

While any term, no matter how niche, will likely produce a result that matches, you could find yourself squinting to see the resemblance. Celebrities and cartoon characters can often come out as blobs that vaguely resemble the original and, weirder still, the model really can’t do faces.

“The image is encoded into a very short sequence of numbers so that the model can learn faster. Because of this, the model makes a lot of mistakes. However, when you draw the Moon, a landscape or a tree, you don’t really notice the issues there.

“When it is on a face, we pay a lot more attention. If the eyes are out of order or the nose is misshaped, it is weird. It is the same on animals and cartoon characters, it’s just something we pay more attention to than misshaped objects. Really, the model is equally good or bad at everything.”

This doesn’t mean that the model is incapable of making faces; it simply requires a lot of work on the user’s part. Some have found ways to force the model to create a face by writing long and detailed prompts, listing the size and location of each part of the face.

You shall not pass… the juice © Dall-E mini

Dealing with the huge numbers and the future of Dall-E mini 

While the free nature of Dall-E mini is what makes it stand out, it also brings its own problems. Unlike OpenAI’s waitlist system, which offers access to a few thousand people here and there, Dall-E mini was instantly available to everyone.

“The number of people using it is crazy right now. As it became viral, I made small changes to make it more efficient and then I could handle more traffic, but then the traffic would increase again, and I could never keep up.

“I’m looking to scale it up with more servers and be able to adapt. Little by little we’re able to support more traffic and hopefully in the future, traffic won’t be an issue.”

However, as the project scales and grows, Boris is now asking the same question that both OpenAI and Google will be asking – whether this can keep going without any financial aid or monetisation.

“I think monetisation is important. I want to be able to make it scalable so everyone can use it now and it is very important to me to make it free for everyone to use. My goal is for this to be a self-sustainable project that everyone can use for free.”
